# Knowledge Distilled BERT

## Bert L12 H240 A12

A BERT variant pre-trained with knowledge distillation, with a hidden dimension of 240 and 12 attention heads, suitable for masked language modeling tasks.

Tags: Large Language Model, Transformers · Author: eli4s
## Bert L12 H256 A4

A lightweight BERT model pre-trained with knowledge distillation, with a hidden dimension of 256 and 4 attention heads, suitable for masked language modeling tasks.

Tags: Large Language Model, Transformers · Author: eli4s
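
Both checkpoints target masked language modeling, so they can be loaded through the standard `AutoModelForMaskedLM` interface of the Hugging Face transformers library. Below is a minimal sketch; the repository id `eli4s/Bert-L12-h256-A4` is an assumed Hugging Face id inferred from the listing above, so substitute the actual id from the author's model page.

```python
# Minimal sketch: masked language modeling with one of the distilled BERT
# variants listed above, using the Hugging Face transformers library.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumed repository id based on this listing; adjust to the real model id.
model_id = "eli4s/Bert-L12-h256-A4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)
model.eval()

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring token at the [MASK] position.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```

The sketch only covers inference; fine-tuning on a downstream task would follow the usual transformers training workflow for BERT-style encoders.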